Effects of mobility in a population of Prisoner's Dilemma players

We address the problem of how the survival of cooperation in a social system depends on the motion of
the individuals. Specifically, we study a model in which Prisoner's Dilemma players are allowed to move in a 
two-dimensional plane. Our results show that cooperation can survive in such a system provided that both the 
temptation to defect and the velocity at which agents move are not too high. Moreover, we show that when 
these conditions are fulfilled, the only asymptotic state of the system is that in which all players are cooperators. 
Our results might have implications for the design of cooperative strategies in motion coordination and other 
applications including wireless networks.

An open question in biology and the social sciences is how cooperation emerges in a population
of selfish individuals. A theoretical framework that has shed some light on this long-standing
problem is evolutionary game theory. Through the development and study of different social
dilemmas, scientists have been able to elucidate some of the mechanisms that enable cooperative
behavior in populations. In particular, one of the most studied games is the Prisoner's Dilemma
(PD), a two-player game in which each individual can adopt only one of two available strategies:
cooperation (C) or defection (D). While a well-mixed population of individuals playing a PD game
does not support cooperation, a spatial structure allows cooperation to survive under certain
conditions, as cooperative clusters can emerge in the system [1, 2].

In recent years, the field has been spurred by new discoveries about the actual structure of the
systems to which evolutionary models are applied. It turns out that in the vast majority of
real-world networks of interactions the probability that an individual has k contacts follows a
power-law distribution P(k) ~ k^(-γ), where γ is an exponent that usually lies between 2 and 3.
Examples of these so-called scale-free (SF) networks can be found in almost every field of
science [3]. An alternative to a power-law distribution is a network of contacts whose degree
distribution approaches an exponential tail for k larger than the average connectivity in the
population, the Erdős-Rényi (ER) network being the benchmark for this kind of distribution [3].

Recent works have shown that cooperative behavior is actually enhanced when individuals play on
complex networks, particularly if the network of contacts is scale-free. The reason is that
cooperators become fixed at the highly connected nodes and turn their neighbors into cooperators
as well, which guarantees their long-term success. Additionally, several works have explored
different rewiring mechanisms that improve the average level of cooperation in the system. In
contrast, cooperation can also be promoted without invoking different rewiring rules [12, 13].
Interestingly, social dilemmas can also be used to generate highly cooperative networks by
implementing a growth mechanism in which newcomers are attracted to already existing nodes with
a probability that depends on the nodes' benefits [14].

In spite of the relatively large body of work accumulated in the last few years, there are
situations of practical relevance that remain less explored. This is the case of models in which
individuals can move and continuously change their neighborhood, encountering different game
partners as time goes on. Highly changing environments can be found in a number of social
situations, and studying how cooperative levels are affected by the inherent mobility of the
system's constituents can shed light on the general question of how cooperation emerges.
Furthermore, the insight gained can be used to design cooperation-based protocols for
communication between wireless devices such as robots [15]. Recently, a few works have dealt
with this kind of situation. However, the models were limited to the case in which individuals
are allowed to move on the sites of a 2D regular lattice. In this paper, we consider the less
constrained case in which a set of Prisoner's Dilemma players move unconditionally on a
two-dimensional plane. We explore the conditions under which cooperation is sustained. In
particular, we inspect the robustness of the average level of cooperation in the population
under variation of the game parameters and of the mobility rules. Our results show that
cooperation is actually promoted provided that players do not move too fast and that cooperation
is not too expensive. Additionally, at variance with other cases, the dynamics of the system
exhibits only two stable attractors: those in which the whole population plays one of the two
possible strategies.

In our model, we consider N agents (individuals) moving in a square plane of size L with
periodic boundary conditions, and playing a game on the instantaneous network of contacts. The
three main ingredients of the model are: the rules of the motion, the definition of the graph of
interactions, and the rules of the evolutionary game.

Motion. Each agent moves at time t with a velocity v_i(t) (i = 1, 2, ..., N). We assume that
individuals can only change their direction of motion, θ_i(t), but not their speed, which is
constant in time and equal for all the agents. Hence we can write the velocities as
v_i(t) = (v cos θ_i(t), v sin θ_i(t)). The individuals are initially assigned a random position
in the square and a random direction of motion. At each time step they update their positions
and velocities according to the following dynamical rules:

$$\mathbf{x}_i(t+1) = \mathbf{x}_i(t) + \mathbf{v}_i(t) \tag{1}$$

$$\theta_i(t+1) = \eta_i \tag{2}$$

where x_i(t) is the position of the i-th agent in the plane at time t and η_i are N independent
random variables chosen at each time step with uniform probability in the interval [-π, π].
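
As an illustration of the motion rules, the following minimal Python sketch applies Eqs. (1) and (2) once to a handful of agents. The function name `move_agents`, the list-of-tuples representation, and the use of the modulo operator for the periodic boundaries are assumptions made for this example, not details given in the text.

```python
import math
import random

def move_agents(positions, headings, speed, box_size, rng):
    """One application of Eqs. (1)-(2): step along the current heading, then redraw it."""
    new_positions = []
    for (x, y), theta in zip(positions, headings):
        # Eq. (1): x_i(t+1) = x_i(t) + v_i(t); the modulo wraps the
        # coordinate back into the box (periodic boundary conditions).
        new_x = (x + speed * math.cos(theta)) % box_size
        new_y = (y + speed * math.sin(theta)) % box_size
        new_positions.append((new_x, new_y))
    # Eq. (2): every heading is replaced by a fresh uniform draw in [-pi, pi].
    new_headings = [rng.uniform(-math.pi, math.pi) for _ in headings]
    return new_positions, new_headings

rng = random.Random(0)          # fixed seed, for reproducibility only
box_size = 10.0                 # hypothetical value of L
positions = [(rng.uniform(0, box_size), rng.uniform(0, box_size)) for _ in range(5)]
headings = [rng.uniform(-math.pi, math.pi) for _ in range(5)]
moved, new_headings = move_agents(positions, headings, 0.01, box_size, rng)
```

Because the speed is the same for every agent, each agent is displaced by exactly v per step; only the direction is random.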

Network of interactions. At each time step we consider that the neighborhood of a given agent i
is made up of all the individuals j that are within a Euclidean distance d_ij less than some
threshold r. In what follows, without loss of generality, we set r = 1. Therefore, the
instantaneous network of contacts is defined as the graph whose nodes are the N agents, each at
the center of a circle of radius 1, together with the links between each individual and the
agents in its neighborhood. Note that as agents move at every time step, the network of
contacts, and hence the adjacency matrix of the graph, is continuously changing, not only
because the number of contacts an individual has may change, but also because the neighbors are
not always the same. The topological features of the graph defined above depend on several
parameters. For instance, the mean degree of the graph can be written as ⟨k⟩ = ρπr² = ρπ, where
ρ = N/L² is the density of agents. For small values of ρ, the graph is composed of several
components and there may also exist isolated individuals. On the contrary, when ρ > ρ_c a unique
giant component appears [20] (for our system with periodic boundary conditions ρ_c ≈ 1.43).
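
The construction of the instantaneous contact network can be sketched as follows. This is a brute-force O(N²) illustration, assuming a minimum-image distance for the periodic boundaries; the function name `contact_network` and the population size used in the example are hypothetical choices, not values from the text.

```python
import math
import random

def contact_network(positions, box_size, radius=1.0):
    """Neighbor lists: j is a neighbor of i if their periodic distance is below radius."""
    n = len(positions)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        xi, yi = positions[i]
        for j in range(i + 1, n):
            dx = abs(xi - positions[j][0])
            dy = abs(yi - positions[j][1])
            # Minimum-image convention: take the shorter way around the torus.
            dx = min(dx, box_size - dx)
            dy = min(dy, box_size - dy)
            if dx * dx + dy * dy < radius * radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors

rng = random.Random(1)
n_agents, density = 400, 1.3              # illustrative size; rho as in the figures
box_size = math.sqrt(n_agents / density)  # from rho = N / L^2
positions = [(rng.uniform(0, box_size), rng.uniform(0, box_size)) for _ in range(n_agents)]
neighbors = contact_network(positions, box_size)
mean_degree = sum(len(nb) for nb in neighbors) / n_agents
print(f"measured <k> = {mean_degree:.2f}, rho * pi = {density * math.pi:.2f}")
```

For uniformly placed agents the measured mean degree should fluctuate around ρπ, which gives a quick sanity check on the neighbor search.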

Evolutionary dynamics. As the rules governing the evolutionary dynamics, we assume that
individuals interact by playing the Prisoner's Dilemma (PD) game. Initially, players adopt one
of the two available strategies, namely to cooperate or to defect, with the same probability
1/2. At every round of the game all the agents play once with each of their instantaneous
neighbors. The results of a game translate into the following payoffs: both agents receive R
under mutual cooperation and P under mutual defection, while a cooperator receives S when facing
a defector, which in turn receives T. These four payoffs are ordered as T > R > P > S in the PD
game, so that defection is the best choice regardless of the opponent's strategy. As usual in
recent studies, we choose the PD payoffs as R = 1, P = S = 0, and T = b > 1.
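
With this parametrization, the payoff accumulated by an agent in one round can be computed as in the short sketch below. The function name `accumulate_payoffs` and the string labels 'C' and 'D' are assumptions made for the example.

```python
def accumulate_payoffs(strategies, neighbors, b):
    """Total payoff of each agent after one game with every neighbor.

    strategies[i] is 'C' or 'D'; payoffs follow R = 1, P = S = 0, T = b.
    """
    payoff_table = {
        ("C", "C"): 1.0,  # R: mutual cooperation
        ("C", "D"): 0.0,  # S: cooperator facing a defector
        ("D", "C"): b,    # T: defector facing a cooperator
        ("D", "D"): 0.0,  # P: mutual defection
    }
    return [
        sum(payoff_table[(strategies[i], strategies[j])] for j in neighbors[i])
        for i in range(len(strategies))
    ]

# Three mutually connected agents: two cooperators and one defector.
strategies = ["C", "C", "D"]
neighbors = [[1, 2], [0, 2], [0, 1]]
payoffs = accumulate_payoffs(strategies, neighbors, b=1.1)
```

In this triangle each cooperator earns R = 1 from the other cooperator and nothing from the defector, while the defector earns T = b from each of its two cooperating neighbors.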



Once the agents have played with all their neighbors, they accumulate the payoffs obtained in
each game and, depending on their total payoffs and on the payoffs of their first neighbors,
they decide whether or not to keep playing with the same strategy in the next round. In this
process, an agent i picks at random one of its neighbors, say j, and compares their respective
payoffs P_i and P_j. If P_i > P_j, nothing happens and i keeps playing with the same strategy.
On the contrary, if P_j > P_i, agent i adopts the strategy of j with a probability proportional
to the payoff difference:

$$\Pi_{i \to j} = \frac{P_j - P_i}{\max\{k_j, k_i\}\, b} \tag{3}$$

where k_i and k_j are the number of instantaneous neighbors of i and j respectively (i.e., the
number of agents inside the circles of radius r centered at i and j respectively). This process
of strategy updating is done synchronously for all the agents of the system and is a finite
population analogue of replicator dynamics. When finished, the payoffs are reset to zero, so
that repeated games are not considered.
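
The update rule of Eq. (3) can be sketched as follows. The function names are hypothetical, and the treatment of isolated agents (they simply keep their strategy) is an assumption, since the text does not address that case. Note that P_j ≤ k_j b and P_i ≥ 0 with the chosen payoffs, so the ratio in Eq. (3) never exceeds 1 and is a valid probability.

```python
import random

def imitation_probability(payoff_i, payoff_j, k_i, k_j, b):
    """Eq. (3): probability that i copies j; zero unless j earned strictly more."""
    if payoff_j <= payoff_i:
        return 0.0
    return (payoff_j - payoff_i) / (max(k_i, k_j) * b)

def update_strategies(strategies, payoffs, neighbors, b, rng):
    """Synchronous update: every decision uses the strategies of the current round."""
    new_strategies = list(strategies)
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # assumption: an isolated agent keeps its strategy
        j = rng.choice(nbrs)
        prob = imitation_probability(
            payoffs[i], payoffs[j], len(nbrs), len(neighbors[j]), b
        )
        if rng.random() < prob:
            new_strategies[i] = strategies[j]
    return new_strategies

# Triangle of two cooperators and one defector with b = 1.1.
strategies = ["C", "C", "D"]
neighbors = [[1, 2], [0, 2], [0, 1]]
payoffs = [1.0, 1.0, 2.2]
new_strategies = update_strategies(strategies, payoffs, neighbors, 1.1, random.Random(0))
```

Writing the new strategies into a separate list is what makes the update synchronous: an agent that changes strategy in this round is still seen with its old strategy by everyone else.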

The movement and game dynamics might in general be correlated, and the influence of the agents'
movement on the performance of the PD dynamics depends on the ratio between their corresponding
time scales. Here, we consider the situation in which movement and evolutionary dynamics have
the same time scale. Therefore, at each time step, the following sequence is performed: (i) the
agents perform a new movement in the two-dimensional space, (ii) the new network of contacts is
established (determined by the radius r of interaction), and (iii) the agents play a round of
the PD game, accumulating the payoffs and finally updating their strategies accordingly. After
this last step, the players move again. The process is repeated until a stationary state is
reached. Here, a stationary state is one in which no further changes of strategies are possible.
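
The sequence (i)-(iii) can be put together in a compact, self-contained sketch of one realization. This is an illustration under stated assumptions rather than the simulation code used for the results: the function name `simulate`, the `max_steps` cap, the brute-force neighbor search, and the use of "all agents share one strategy" as the stopping test are choices made for this example.

```python
import math
import random

def simulate(n_agents, density, speed, b, max_steps, seed=0, radius=1.0):
    """Run one realization; return (final cooperator fraction, steps performed)."""
    rng = random.Random(seed)
    box = math.sqrt(n_agents / density)  # side L of the square, from rho = N / L^2
    pos = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n_agents)]
    theta = [rng.uniform(-math.pi, math.pi) for _ in range(n_agents)]
    coop = [rng.random() < 0.5 for _ in range(n_agents)]  # True means cooperator
    steps = 0
    # Stop at the step cap or once all agents share one strategy (all-C or all-D).
    while steps < max_steps and 0 < sum(coop) < n_agents:
        # (i) Move along the current heading, then draw a new heading.
        pos = [((x + speed * math.cos(a)) % box, (y + speed * math.sin(a)) % box)
               for (x, y), a in zip(pos, theta)]
        theta = [rng.uniform(-math.pi, math.pi) for _ in range(n_agents)]
        # (ii) Rebuild the contact network using periodic distances.
        nbrs = [[] for _ in range(n_agents)]
        for i in range(n_agents):
            for j in range(i + 1, n_agents):
                dx = abs(pos[i][0] - pos[j][0])
                dy = abs(pos[i][1] - pos[j][1])
                dx, dy = min(dx, box - dx), min(dy, box - dy)
                if dx * dx + dy * dy < radius * radius:
                    nbrs[i].append(j)
                    nbrs[j].append(i)
        # (iii) Play: only cooperating neighbors yield payoff (R = 1 or T = b).
        pay = [(1.0 if coop[i] else b) * sum(coop[j] for j in nbrs[i])
               for i in range(n_agents)]
        # Synchronous imitation with the probability of Eq. (3).
        new_coop = list(coop)
        for i in range(n_agents):
            if nbrs[i]:
                j = rng.choice(nbrs[i])
                if pay[j] > pay[i]:
                    prob = (pay[j] - pay[i]) / (max(len(nbrs[i]), len(nbrs[j])) * b)
                    if rng.random() < prob:
                        new_coop[i] = coop[j]
        coop = new_coop
        steps += 1
    return sum(coop) / n_agents, steps

# Small illustrative run; the population size here is far below that of the paper.
fraction, steps = simulate(n_agents=60, density=1.3, speed=0.01, b=1.1, max_steps=50)
```

Payoffs are recomputed from scratch at every step, which matches the statement that payoffs are reset to zero after each update. Repeating such runs with different seeds and counting those that end with a cooperator fraction of 1 would give an estimate of the fraction of all-C realizations.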

We have performed extensive numerical simulations of the model for various values of the agent
density ρ and velocity v, and different values of the game parameter b. Let us first note that
for the limiting case v = 0, the results show that the average level of cooperation is different
from zero, as one might expect from the fact that the underlying network of contacts has a
Poisson degree distribution. Indeed, the graph corresponds to a random geometric graph [20], a
network having the same P(k) as an ER random graph, but with a higher clustering coefficient.
This latter feature leads to a further increase of the average level of cooperation, as it has
been shown that a network with a high clustering coefficient promotes cooperation.

Let us now focus on the case v ≠ 0. The first difference that arises with respect to the case in
which agents do not move is that the dynamics of the system has only two attractors. Namely, the
asymptotic state (i.e., when the probability that any player changes its strategy is zero) is
either a fully cooperative network (all-C) or a network in which all the individuals end up
playing as defectors (all-D). This behavior is illustrated in Fig. 1, where we have reported the
average level of cooperation ⟨c⟩ in a population of N = 10^ individuals as a function of time,
for v = 0.01 and for two different values of b. Starting from a configuration in which
individuals are cooperators or defectors with the same probability, the average level of
cooperation slowly evolves to one of the two asymptotic states: all-C or all-D. It is also worth
stressing that the system reaches those states more slowly than in static settings (i.e., when
v = 0). Specifically, it appears that the system spends a considerable time in metastable states
(flat regions in the figure) that are followed by a sudden decrease (or increase) of the average
level of cooperation.

FIG. 1: Average level of cooperation, ⟨c⟩, as a function of time (Monte Carlo steps) for
v = 0.01 and two different values of b, b = 1.1 and b = 1.3, as indicated. Other model
parameters have been fixed to ρ = 1.30 and N = 10^ agents.

FIG. 2: Fraction of realizations in which the system ends up in an all-C configuration, F_c, as
a function of the density of players ρ for fixed values of b = 1.1 and v = 0.01. The system is
made up of N = 10^ agents. The results are averages taken over 100 different realizations.

FIG. 3: (Color online) The color code shows the fraction of realizations in which the whole
system is made up of cooperators, F_c, as a function of the velocity at which the agents move
(v) and the temptation to defect (b). The Y-axis is in log scale for clarity. The remaining
parameters are N = 10^ agents and ρ = 1.30. Each point is an average over 100 different
realizations.

The evolution of the system depends on the density of players. In Fig. 2 we show the fraction of
realizations, F_c, in which the population ends up in an all-C configuration as a function of
the density ρ for b = 1.1 and v = 0.01. There are two limits for which F_c = 0. At low values of
the density, the agents are too spread out in the 2D plane. As a result, cooperators strive
unsuccessfully to survive and go extinct, given the low chance they have of forming clusters,
the only mechanism that can ensure their success. On the contrary, for large values of ρ the
population is quite dense and, locally, the agents' neighborhoods resemble a well-mixed
population in which more or less everybody interacts with everybody, and therefore defection is
the only possible asymptotic state. Values of ρ between these two limiting cases give
cooperators a chance to survive. Interestingly, there is a region of the density of players,
0.9 ≲ ρ ≲ 3, which is optimal for cooperative behavior. Beyond this region F_c decays
exponentially with ρ, reaching zero at ρ ≈ 7.
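
To relate these density values to neighborhood sizes, the short sketch below converts the densities quoted in the text into mean degrees through ⟨k⟩ = ρπr² with r = 1. The helper name `mean_degree` and the labels are illustrative only.

```python
import math

def mean_degree(density, radius=1.0):
    """Mean number of neighbors, <k> = rho * pi * r^2, for uniformly placed agents."""
    return density * math.pi * radius ** 2

# Densities quoted in the text, with r = 1.
quoted_densities = [
    ("lower edge of the optimal region", 0.9),
    ("density used in Figs. 1, 3 and 4", 1.3),
    ("giant-component threshold", 1.43),
    ("upper edge of the optimal region", 3.0),
    ("density at which F_c reaches zero", 7.0),
]
for label, rho in quoted_densities:
    print(f"{label}: rho = {rho}, <k> = {mean_degree(rho):.2f}")
```

In these terms the region favorable to cooperation corresponds to roughly 3 to 9 neighbors per agent on average, while all-C states are no longer reached once agents have about 22 neighbors.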

Up to now, we have analyzed the behavior of the system for small values of the velocity of the
agents and of the temptation to defect. Figure 3 summarizes the results obtained for a wider
range of model parameters (v and b) in a population of N = 10^ agents and ρ = 1.3. The results
are averages taken over 100 realizations of the model. The phase diagram shows a relatively wide
region of the model parameters in which cooperative behavior survives. For a fixed value of v,
this region is bounded by a maximum value of the temptation to defect close to b = 1.3, which
decreases as the velocity at which players move increases. Furthermore, when b is kept fixed,
increasing the value of v is not always beneficial for the survival of cooperation. In fact,
when the individuals move too fast, they change their environment often and quickly, which
increases the likelihood of meeting a completely different set of players at each time step. In
other words, when the velocity is increased beyond a certain value, the well-mixed hypothesis
applies to the whole population of players, thus leading to the extinction of cooperation in the
long time limit.

Figure 4 sheds more light on the dependence of the fraction of cooperators on the velocity of
the agents. There we show the slice of Fig. 3 corresponding to b = 1.1. As can be seen from the
figure, for low values of v all the realizations lead the system to a configuration in which all
strategists are cooperators. As the PD players move faster, the probability of achieving such a
configuration decreases and reaches zero for values of v close to 0.05. From that point on, the
all-C asymptotic state is never realized. This latter point also depends on the specific value
of b. The inset of Fig. 4 shows, as a function of v, the smallest value of the temptation to
defect, b_c, for which the system ended up in the all-defector state in all the realizations
performed. The results show that beyond v ≈ 0.1, cooperation never survives in a population of
moving agents irrespective of b.

FIG. 4: Fraction of realizations ending up in an all-C configuration as a function of the
velocity v of the agents for b = 1.1. The inset shows the smallest value of the temptation to
defect, b_c, for which the probability of achieving a fully cooperative asymptotic state is
zero, as a function of v. In both cases, N = 10^ agents, ρ = 1.30, and results correspond to
averages over 100 realizations.

In short, we have studied the effects of mobility on a population of Prisoner's Dilemma players
that are able to move in a two-dimensional plane. Numerical simulations of the model show that a
fully cooperative system is sustained when both the temptation to defect and the velocity of the
agents are not too high. Although cooperation is extinguished for a wide region of the parameter
space, our results show that mobility has a positive effect on the emergence of cooperation. As
a matter of fact, as soon as v ≠ 0, the mobility of the agents causes the winning strategy to
spread to the whole population, leading the system to a global attractor in which all players
share the surviving strategy. In other words, the movement of individuals prevents the
coexistence of different strategies in the long time limit. Namely, for small (and fixed) values
of b cooperation prevails at low velocities, while defection succeeds for larger v. Our results
are relevant for the design of new cooperation-based protocols aimed at motion coordination
among wireless devices and for other communication processes based on game theoretical models.